PredicT-ML: a tool for automating machine learning model building with big clinical data

Author

  • Gang Luo
Abstract

BACKGROUND: Predictive modeling is fundamental to transforming large clinical data sets, or "big clinical data," into actionable knowledge for various healthcare applications. Machine learning is a major predictive modeling approach, but two barriers make its use in healthcare challenging. First, a machine learning tool user must choose an algorithm and assign one or more model parameters called hyper-parameters before model training. The algorithm and hyper-parameter values used typically impact model accuracy by over 40%, but their selection requires many labor-intensive manual iterations that can be difficult even for computer scientists. Second, many clinical attributes are repeatedly recorded over time, requiring temporal aggregation before predictive modeling can be performed. Many labor-intensive manual iterations are required to identify a good pair of aggregation period and operator for each clinical attribute. Both barriers result in time and human resource bottlenecks, and preclude healthcare administrators and researchers from asking a series of what-if questions when probing opportunities to use predictive models to improve outcomes and reduce costs.

METHODS: This paper describes our design of and vision for PredicT-ML (prediction tool using machine learning), a software system that aims to overcome these barriers and automate machine learning model building with big clinical data.

RESULTS: The paper presents the detailed design of PredicT-ML.

CONCLUSIONS: PredicT-ML will open the use of big clinical data to thousands of healthcare administrators and researchers and increase the ability to advance clinical research and improve healthcare.
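To make the second barrier concrete, the following is a minimal sketch of what a "pair of aggregation period and operator" means for one repeatedly recorded clinical attribute. It is not PredicT-ML's design, which the abstract does not detail. The candidate periods, the operator set, the function names, and the sample values are all hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical candidate operators and look-back periods (in days).
OPERATORS = {"mean": mean, "max": max, "min": min, "count": len}
PERIODS_DAYS = [30, 90, 365]


def aggregate(records, index_date, period_days, operator):
    """Collapse (date, value) records from the look-back window into one number.

    The window covers the period_days days before index_date (index_date
    itself is excluded). Returns None when the window is empty, except
    for "count", which returns 0.
    """
    start = index_date - timedelta(days=period_days)
    window = [value for day, value in records if start <= day < index_date]
    if not window:
        return 0 if operator == "count" else None
    return OPERATORS[operator](window)


def candidate_features(records, index_date):
    """Compute one candidate feature per (period, operator) pair."""
    return {
        (period, op): aggregate(records, index_date, period, op)
        for period in PERIODS_DAYS
        for op in OPERATORS
    }


# Made-up measurements of a single attribute for one patient.
example_records = [
    (date(2016, 1, 1), 7.0),
    (date(2016, 5, 20), 9.0),
    (date(2016, 6, 10), 8.0),
]
example_features = candidate_features(example_records, date(2016, 6, 15))
```

Even this toy setup yields 12 candidate features for a single attribute. Choosing among them by hand, for every attribute, is the manual iteration the abstract describes.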

Similar resources

Automating Construction of Machine Learning Models with Clinical Big Data: Rationale and Methods

Background: To improve health outcomes and cut healthcare costs, we often need to conduct prediction/classification using large clinical data sets, a.k.a. “clinical big data,” e.g., to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning won most data science competitions and could support many clinical...

Hypertension Prediction in Primary School Students Using an Ensemble Machine Learning Method

Introduction: The prevalence of hypertension in children is increasing, and this complication is considered the most important risk factor for cardiovascular diseases in older age. Early detection and control of hypertension can prevent its progress and reduce its consequences. Machine learning methods can help predict this complication promptly and reduce cost and time. This study aimed to pro...

Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods

BACKGROUND To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many c...

Automated Machine Learning on Big Data using Stochastic Algorithm Tuning

We introduce a means of automating machine learning (ML) for big data tasks, by performing scalable stochastic Bayesian optimisation of ML algorithm parameters and hyper-parameters. More often than not, the critical tuning of ML algorithm parameters has relied on domain expertise from experts, along with laborious handtuning, brute search or lengthy sampling runs. Against this background, Bayes...

Journal title:

Volume 4, Issue

Pages -

Publication date: 2016